smooth fictitious play
Smooth Fictitious Play in Stochastic Games with Perturbed Payoffs and Unknown Transitions
Recent extensions to dynamic games of the well-known fictitious play learning procedure for static games have been proved to converge globally to stationary Nash equilibria in two important classes of dynamic games: zero-sum and identical-interest discounted stochastic games. However, those decentralized algorithms require the players to know the model exactly (the transition probabilities and their payoffs at every stage). To remove these strong assumptions, this paper introduces regularizations of those recent algorithms that are, moreover, model-free: players do not know the transitions, and their payoffs are perturbed at every stage. Our novel procedures can be interpreted as extensions to stochastic games of the classical smooth fictitious play learning procedures for static games, in which players' best responses are regularized through a smooth perturbation of their payoff functions. We prove that our family of procedures converges to stationary regularized Nash equilibria in the same two classes of dynamic games. The proof relies on the continuous-time smooth best-response dynamics counterparts and on stochastic approximation methods.
- Information Technology > Game Theory (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.73)
- Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)
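To make the smooth (regularized) best-response idea from the abstract above concrete, here is a minimal sketch of smooth fictitious play in a static two-player zero-sum matrix game, assuming logit (softmax) smoothing. The payoff matrix, temperature `tau`, and 1/t step schedule are illustrative choices, not taken from the paper, which treats the harder stochastic-game setting.

```python
import numpy as np

def logit_best_response(payoffs, tau):
    """Smooth best response: softmax of expected payoffs (entropy regularization)."""
    z = payoffs / tau
    z -= z.max()                     # numerical stability
    p = np.exp(z)
    return p / p.sum()

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))      # row player's payoffs; column player gets -A
tau = 0.1                            # smoothing temperature (illustrative)
x_bar = np.ones(3) / 3               # row player's empirical average play
y_bar = np.ones(3) / 3               # column player's empirical average play

for t in range(1, 20001):
    # Each player plays a smooth best response to the opponent's empirical play.
    x = logit_best_response(A @ y_bar, tau)        # row player's expected payoffs
    y = logit_best_response(-(A.T @ x_bar), tau)   # column player's expected payoffs
    # Update empirical averages with step size 1/t (stochastic-approximation form).
    x_bar += (x - x_bar) / t
    y_bar += (y - y_bar) / t

# (x_bar, y_bar) approximates a logit (quantal-response) equilibrium, i.e. a
# regularized Nash equilibrium of the matrix game.
print(x_bar, y_bar)
```

As tau shrinks, the logit response approaches an exact best response and the limit point approaches an exact Nash equilibrium; the paper's contribution is extending this style of argument to stochastic games with perturbed payoffs and unknown transitions.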
Mastering Strategy Card Game (Legends of Code and Magic) via End-to-End Policy and Optimistic Smooth Fictitious Play
Wei Xi, Yongxin Zhang, Changnan Xiao, Xuefeng Huang, Shihong Deng, Haowei Liang, Jie Chen, Peng Sun
Deep Reinforcement Learning combined with Fictitious Play shows impressive results on many benchmark games, most of which are, however, single-stage. In contrast, real-world decision-making problems may consist of multiple stages, where the observation spaces and action spaces can differ completely across stages. We study the two-stage strategy card game Legends of Code and Magic and propose an end-to-end policy to address the difficulties that arise in multi-stage games. We also propose an optimistic smooth fictitious play algorithm to find a Nash equilibrium of the two-player game. Our approach won two championships at the COG 2022 competition, and extensive studies verify its advantages.
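The abstract does not spell out the optimistic update, so the following is a hypothetical sketch under one standard notion of optimism (as in optimistic follow-the-regularized-leader): each player forms a belief from the opponent's empirical action counts with the most recent observed action counted twice, then plays a smooth (softmax) best response. The payoff matrix, temperature, and counts initialization are illustrative, not the paper's.

```python
import numpy as np

def softmax(v, tau=0.1):
    z = (v - v.max()) / tau
    p = np.exp(z)
    return p / p.sum()

def optimistic_belief(counts, last_action):
    """Empirical opponent mixture with the last observed action double-counted."""
    c = counts.copy()
    c[last_action] += 1.0            # optimistic (predictive) extra count
    return c / c.sum()

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))      # zero-sum payoffs for player 1 (illustrative)
counts1 = np.ones(4)                 # observed action counts (ones avoid empty belief)
counts2 = np.ones(4)
last1, last2 = 0, 0

for t in range(5000):
    b2 = optimistic_belief(counts2, last2)   # player 1's belief about player 2
    b1 = optimistic_belief(counts1, last1)   # player 2's belief about player 1
    x = softmax(A @ b2)              # smooth best response of player 1
    y = softmax(-(A.T @ b1))         # smooth best response of player 2
    last1 = rng.choice(4, p=x)       # sample and record the realized actions
    last2 = rng.choice(4, p=y)
    counts1[last1] += 1
    counts2[last2] += 1

print(counts1 / counts1.sum(), counts2 / counts2.sum())
```

In the paper's setting the expected payoffs would come from a learned end-to-end policy's value estimates for the two-stage card game rather than a fixed matrix; the sketch only illustrates the optimistic smooth fictitious play loop itself.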